(57) Abstract



The present invention relates to the use of a mammary specific
α-lactalbumin protein to assist in the production of recombinant
proteins in mammals' milk. The invention also relates to the
genetically engineered mammal that produces the desired recombinant
product in its milk and to the products produced by the genetically
engineered mammal, including the altered composition of milk and the
semen which includes the desired α-lactalbumin protein DNA sequence.
DNA SEQUENCE ENCODING BOVINE α-LACTALBUMIN
AND METHODS OF USE

FIELD OF THE INVENTION 
The present invention relates generally to a
DNA sequence encoding bovine α-lactalbumin and to methods
of producing proteins including recombinant proteins in
the milk of lactating genetically engineered or
transgenic mammals. The present invention relates also
to genetically engineered or transgenic mammals that
secrete the recombinant protein. The present invention
is also directed to a genetic marker for identifying
animals with superior milk producing characteristics.

REFERENCE TO CITED ART 
Reference is made to the section preceding
the CLAIMS for a full bibliography citation of the art
cited herein.

DESCRIPTION OF THE PRIOR ART 
α-Lactalbumin is a major whey protein found
in cow's milk (Eigel et al., 1984). The term "whey
protein" includes a group of milk proteins that remain
soluble in "milk serum" or whey after the precipitation
of casein, another milk protein, at pH 4.6 and 20°C.
α-Lactalbumin has these characteristics.

α-Lactalbumin is a secretory protein that
normally comprises about 2.5% of the total protein in
milk. α-Lactalbumin has been used as an index of mammary
gland function in response to hormonal regulation in
bovine explant culture (Akers et al., 1981; Goodman et
al., 1983) and as an index of udder development (McFadden
et al., 1986). α-Lactalbumin interacts with galactosyl
transferase and therefore plays an essential role in the
biosynthesis of milk sugar lactose (Brew, K. and R.L.
Hill, 1975). Lactose is an important component in milk,
and contributes to milk osmolality. It is the most
constant constituent in cow's milk (Larson, 1985).
α-Lactalbumin is useful as an index of lactogenesis in
cultured mammary tissue (McFadden et al., 1987). It is
therefore believed that α-lactalbumin is an important
protein in controlling milk yield and can be used as an
indicator of mammary function.

The expression of bovine α-lactalbumin may be
a potential rate limiting process in dairy cattle. If
greater expression of the α-lactalbumin gene can be
obtained, then more milk and milk protein could be
produced. In other words, α-lactalbumin is a potential
Quantitative Trait Locus (QTL).

SUMMARY OF THE INVENTION 
One object of the present invention is to
detect possible genetic differences in the expression of
bovine α-lactalbumin.

Another object of the present invention is to
provide a DNA sequence encoding a mammary specific bovine
α-lactalbumin protein having a specified nucleotide
sequence.

It is also an object of the present invention
to provide a method for genetically engineering the
incorporation of one or more copies of a construct
comprising an α-lactalbumin control region, which
construct is specifically activated in the mammary
tissue.

These objects and others are addressed by the
present invention, which is directed to a DNA sequence
encoding bovine α-lactalbumin having a specified
nucleotide sequence.

The present invention is also directed to an
expression vector comprising this DNA sequence. Further,
the present invention is directed to the protein
α-lactalbumin encoded by the nucleotide sequence.

The present invention is also directed to an
expression system comprising a mammary specific
α-lactalbumin control region which, when genetically
incorporated into a mammal, permits the female of that
mammalian species to produce the desired recombinant
protein in its milk.
The present invention is also directed to a
genetically engineered or transgenic mammal comprising
the specified DNA sequence encoding bovine α-lactalbumin.

The present invention is also directed to a
DNA sequence coding for α-lactalbumin, which is
operatively linked to an expression system coding for a
mammary-specific α-lactalbumin protein control, or any
control region which specifically activates α-lactalbumin
in milk or in mammary tissue, through a DNA sequence
coding for a signal peptide that permits secretion and
maturation of the α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for genetically engineering the incorporation of
one or more copies of a construct comprising an
α-lactalbumin control region which specifically activates
α-lactalbumin in milk or in mammary tissue. The control
region is operatively linked to a DNA sequence coding for
a desired recombinant protein through a DNA sequence
coding for a signal peptide that permits the secretion
and maturation of α-lactalbumin in the mammary tissue.

The present invention is also directed to a
process for the production and secretion into a mammal's
milk of an exogenous recombinant protein. The steps
include producing milk in a genetically engineered or
transgenic mammal. The mammal is characterized by an
expression system comprising an α-lactalbumin control
region. The control region is operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the
recombinant protein in mammary tissue. The milk is then
collected for use. Alternatively, the exogenous
recombinant protein is isolated from the milk.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing animals comprising a DNA
sequence encoding bovine α-lactalbumin and having a
specified nucleotide sequence.

The present invention is also directed to a
selection characteristic for identifying superior milk
and milk protein producing mammals. The mammals are
characterized by inherited genetic material in the DNA
structure of the mammal. The genetic material encodes at
least one desired dominant selectable marker for bovine
α-lactalbumin. One such marker is adenine, which is
located at the -13 position on the control region of the
DNA sequence for α-lactalbumin. The present invention is
also directed to a method of predicting superior milk and
milk protein production in animals comprising identifying
the selection characteristic discussed above.

The present invention is further directed to
a method for modifying the milk composition in mammals
which comprises inserting a DNA sequence encoding bovine
α-lactalbumin having a specified nucleotide sequence.

The DNA sequence and the various methods of
using it have potentially beneficial uses for dairy
farmers, artificial insemination organizations, genetic
marker companies, and embryo transfer and cloning
companies, to name a few.

The uses for this genetic marker include the
identification of superior nuclear transfer embryos and
the identification of superior embryos to clone.

The present invention also will aid in the
progeny testing of sires. The specified DNA sequence can
be used as a genetic marker to identify possible elite
sires in terms of milk production and milk protein
production. This will increase the reliability of buying
superior dairy cattle.

The present invention also will provide
assistance in farm management decisions, such as sire
selection and selective culling. The physiological
markers assist in determining future production
performance in addition to a cow's pedigree. From this
information, one could buy or retain a heifer with a DNA
sequence encoding α-lactalbumin of the present invention
and consider culling a heifer without the proper
sequence.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a schematic illustration of a
partial restriction map of the bovine α-lactalbumin of
the present invention. The sequence contains 2.0
kilobases of a 5' flanking region, 1.7 kilobases of a
coding region and 8.8 kilobases of a 3' flanking region.
Digestion with Hpa I yields a 2.8 kilobase fragment
containing the whole 5' flanking region.

Fig. 2 depicts in schematic outline a map of
the plasmid A-lac Pro/pIC 20R. A Hpa I fragment of the
genomic clone was inserted into the EcoRV site of pIC
20R. The Hpa I fragment contains 2.1 kb of 5' flanking
DNA, the signal peptide coding region of α-lactalbumin and
8 bases encoding the mature α-lactalbumin protein. Six
unique enzyme sites are available for attaching various
genes to the sequence.

Fig. 3 is a schematic illustration of a
detailed map of the α-lactalbumin 5' flanking control
region cloned in the EcoRV site of the plasmid pIC 20R (SEQ
ID NO: 1, SEQ ID NO: 2).

Fig. 4 is a schematic illustration of a
detailed map of the 8.0 kilobase Bgl II fragment.

Fig. 5 depicts the nucleotide sequence (SEQ
ID NO: 3) of the control/enhancer region of the bovine
α-lactalbumin protein.

Fig. 6 depicts in schematic outline a map of
a plasmid containing the bovine α-lactalbumin-bovine β-casein
gene construct.

Fig. 7 illustrates a comparison of human and
bovine sequences in the 5' flanking region of the
α-lactalbumin gene: the U.S. bovine sequence of the
present invention (SEQ ID NO: 4), a human sequence (SEQ
ID NO: 5) and the French bovine sequence (SEQ ID NO: 6)
for the putative steroid response element, and the U.S.
bovine sequence of the present invention (SEQ ID NO: 7),
a human sequence (SEQ ID NO: 8) and the French bovine
sequence (SEQ ID NO: 9) for the RNA polymerase binding
region. The compared regions surround three of the four
nucleotide sequence variants.

Fig. 8 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the human α-lactalbumin sequence.

Fig. 9 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the guinea pig α-lactalbumin sequence.

Fig. 10 is a DOTPLOT™ graph comparing the
bovine α-lactalbumin 5' flanking sequence to the same
region of the rat α-lactalbumin sequence.

Fig. 11 is a graph illustrating expression
levels observed in each of three α-lactalbumin transgenic
mouse lines.

Fig. 12 is a 4% NuSieve autoradiographic gel
of Mnl I digested PCR products.

Fig. 13 is a graph illustrating a scatter
plot of each data point in Fig. 12 as well as mean values
for each of the three genotypes.

DETAILED DESCRIPTION OF THE PREFERRED INVENTION
In the Description the following terms are
employed:

Genetic engineering, manipulation or
modification: the formation of new combinations of
materials by the insertion of nucleic acid molecules
produced outside the cell into any virus, bacterial
plasmid or other vector system so as to allow their
incorporation into a host organism in which they do not
naturally occur, but in which they are capable of
continued propagation at least throughout the life of the
host organism. Although the term incorporates transgenic
alteration, the manipulation of the genomic sequence does
not have to be permanent, i.e., the genetic engineering
can affect only the animal which was directly
manipulated.

Transgenic animals: permanently genetically
engineered animals created by introducing new DNA
sequences into the germ line via addition to the egg.

It is within the scope of the present
application to use any mammal for the invention.
Examples of mammals include cows, sheep, goats, mice,
oxen, camels, water buffaloes, llamas and pigs.
Preferred mammals include those that produce large
volumes of milk and have long lactating periods.

The present invention is directed to a gene
which encodes bovine α-lactalbumin. This gene has been
isolated and characterized. The 5' flanking region of
the gene has been cloned into six vectors for use as a
mammary specific control region in the production of
genetically engineered mammals. To better understand the
regulation of this control region, 2.0 kilobases of the
5' flanking sequence have been sequenced. The
α-lactalbumin 5' flanking sequence serves as a useful
mammary-specific "control/enhancer complex" for
engineering genetic constructs that could be capable of
driving the expression of novel and useful proteins in
the milk of genetically engineered or transgenic mammals.
This can result in an increase in milk production and in
the protein content of milk, a change in the milk and/or
protein composition in milk, and the production of
valuable proteins in the milk of genetically engineered
or transgenic mammals. Such proteins include insulin,
growth hormone, growth hormone releasing factor,
somatostatin, tissue plasminogen activator, tumor
necrosis factor, lipocortin, coagulation factors VIII and
IX, the interferons, colony stimulating factor, the
interleukins, urokinase, and industrial enzymes such as
cellulases, hemicellulases, peroxidases, and thermal
stable enzymes.
The α-lactalbumin gene is the preferred gene
for use in the process because it provides a mammary
specific protein 5' control region. It also exerts the tightest
lactational control of all milk proteins. Further, it is
regulated independently of other milk proteins and is
produced in large quantity by lactating animals.

Total Sequence

A gene encoding the milk protein bovine
α-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The Charon 28 lambda library was probed
using a bovine α-lactalbumin cDNA (Hurley, 1987) and a
770 base pair α-lactalbumin polymerase chain reaction
product. The positive lambda clone includes 12.5
kilobases of inserted bovine sequence, consisting of 2.0
kilobases of a 5' flanking (control/enhancer) region, a
1.7 kilobase coding region and 8.8 kilobases of a 3'
flanking region. A partial restriction map of the clone
is illustrated in Fig. 1.

A 2.8 kilobase Hpa I fragment including the
2.0 kilobase control region along with the signal peptide
coding region was cloned into the EcoRV site of the
plasmid pIC 20R. The plasmid is illustrated in schematic
outline in Fig. 2.

An 8.0 kilobase Bgl II fragment containing a
2.0 kilobase 5' flanking control region, a 1.7 kilobase
coding region, 3.0 kilobases of a 3' flanking region, and 1.2
kilobases of lambda DNA has also been isolated.
Reference is made to Fig. 4 for a map of the 8.0
kilobase fragment. Transgenic mice have been produced
using the Bgl II fragment.
Control/Enhancer Region

The 2.0 kilobase 5' flanking region has been
cloned into the vectors pIC 20R and Bluescript KS+. A
schematic illustration of the α-lactalbumin 5' flanking
control region cloned in the EcoRV site of pIC 20R is
depicted in Figs. 2 and 3 (SEQ ID NO: 1, SEQ ID NO: 2).
The construct's multiple cloning site, which
exists downstream of the signal peptide coding region,
permits various genes to be attached to the α-lactalbumin
control region. Thus, this vector allows for easy
attachment of specific coding sequences of genes. It
contains all elements necessary for expression of
proteins in milk, i.e., a mammary specific control
region, a mammary specific signal peptide coding region
and a mature protein-signal peptide splice site which is
able to be cleaved in the mammary gland. The vector also
contains many unique restriction enzyme sites for ease of
cloning. Attachment of genes to this control region will
allow for mammary expression of the genes when these
constructs are placed into mammals. These vectors also
contain the α-lactalbumin signal peptide coding sequence
which will allow for proper transport of the expressed
protein into the milk of the lactating mammal.

The control region construct has driven
mammary expression of a desired protein in transgenic
mice. Bovine α-lactalbumin levels of greater than 1
mg/ml have been observed in the milk of transgenic mouse
lines as described in Example 2 (infra). Constructs
containing the 2.0 kilobase region attached to the bovine
β-casein gene (Bonsing, J., et al., 1988) as well as the
bacterial reporter gene chloramphenicol acetyl
transferase have been produced in our lab. Fig. 6 is a
schematic representation of a plasmid containing the
bovine α-lactalbumin-bovine β-casein gene construct. The
genomic DNA sequence containing the bovine β-casein gene
was attached to the bovine α-lactalbumin 5' flanking
sequence. The vector contains
the polyadenylation site of β-casein along with
approximately 100 base pairs of 5' flanking DNA. The 100
base pairs of 5' flanking DNA are attached to the bovine
α-lactalbumin 5' flanking region at the -100 position.
The construct uses the proximal promoter elements of
β-casein and the distal control region elements of
α-lactalbumin. The β-casein construct has been used to
produce transgenic mice as is illustrated in the
examples.

To understand the regulation exerted by the
control/enhancer region, the 2.0 kilobases of 5' flanking
region were sequenced. A single strand copy of the
sequence is listed in Fig. 5 (SEQ ID NO: 3). The sequence
is listed 5' to 3' with the signal peptide coding region
underlined.

Regulatory Sequences

Potential regulatory sequences contained
within the 5' flanking region of bovine α-lactalbumin
have been identified. There are possible regulatory
regions in the introns as well as in the 3' flanking
region. Portions of the suspected control regions were
examined for possible sequence differences in the
population which might be related to milk and milk
protein production of individual cows. The differences
in the regulatory regions of α-lactalbumin are expected
to lead to differences in expression of α-lactalbumin
mRNA. An increased cellular content of mRNA will
increase the expression of α-lactalbumin protein with a
concomitant increase in lactose synthase resulting,
ultimately, in a milk and milk protein production
increase. This type of mechanism would be considered a
major gene effect on milk and milk protein production by
α-lactalbumin. The changes are viewed as causally linked
to changes in milk and milk protein production and not
correlatively linked. Correlatively linked traits are
those which are closely associated with an unknown
genetic locus which has the direct impact on the
quantitative trait.

Sequence differences between the U.S.
Holstein and the French cow (Vilotte, et al., 1987) of an
unknown breed were found at four positions within the 5'
flanking region. One of the differences lies in a
sequence which appears to be a steroid
hormone response element. Two other differences were
noted in the RNA polymerase binding region and a fourth
in the signal peptide coding region of the gene. Because
of the relationship between these sequences and known
control sequences of mammalian genes, all the variations
occur in regions one would expect to be involved in
regulation of the amount of mRNA produced. Further,
genetic variations which occur in factors binding to
these regions would also be expected to cause changes.

Fig. 7 illustrates sequence variants observed
in the 5' flanking region between the present invention
U.S. bovine, human (Hall et al., 1987) and the French
bovine (Vilotte, 1987) for the putative steroid response
element (SEQ ID NO: 4, SEQ ID NO: 5 and SEQ ID NO: 6
respectively) and for the RNA polymerase binding region
(SEQ ID NO: 7, SEQ ID NO: 8, and SEQ ID NO: 9 respectively).
All of the differences occur in highly conserved portions
of the gene as seen by comparing this region to the same
region of the human α-lactalbumin gene. Fig. 7 also
shows that the positions where the bovine genes differ
are the same positions at which the human gene differs
from the bovine. These data indicate that the bases are part of a
potentially important control region.

A method has been devised to give a clearcut
differentiation between two of the variants at a position
-13 bases from the start of transcription, i.e., 13 base
positions from the signal peptide coding region. The two
variants are termed (α-Lac (-13) A) and (α-Lac (-13) B).
The α-lac (-13) A genotype is an adenine base at position
-13; the α-lac (-13) B genotype is either a guanine,
thymine or cytosine base at -13. They can be
differentiated with a simple restriction enzyme digest of
an amplified polymerase chain reaction (PCR) product
using a specific restriction enzyme (Mnl I). Because of
the specificity of the restriction enzyme Mnl I, the
restriction analysis is unable to distinguish between
the three bases possible in the B genotype. The α-lac (-13) A allele
contains an extra Mnl I site at position -13, giving the
smaller band observed on the gel.

WO 93/04165                                  PCT/US92/06549

To amplify the appropriate region of DNA,
oligonucleotides which frame the sequence of interest
were synthesized. These oligonucleotides were chosen
because of their specific chemical characteristics.
These oligonucleotides were then used in a polymerase
chain reaction to amplify the framed portion of the
α-lactalbumin gene. The oligonucleotides have the
following sequences:

α-lac Seq. 1 (SEQ ID NO: 10)
5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac Seq. 2 (SEQ ID NO: 11)
5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'
Restriction fragment analysis (Sambrook, J.
et al., 1989) was used to examine animals from a number
of breeds of cattle. In most breeds, namely, Jersey,
Guernsey, Brown Swiss, Simmental and Brahman, only one of
two genotypes is found. This is the α-lac (-13) B
genotype. However, in the most popular and highest milk
producing breed of cattle, the Holstein, two genotypes
occur at this position. The frequency of the A genotype
was 27% in random samples, while the frequency of the B
genotype was 73%. Holsteins contain both the genotype
found in the other breeds as well as a separate distinct
genotype which appears to have arisen within the last
thirty years in the U.S. Holstein population as
determined by examining pedigrees of sires currently in
use. It appears that this genotype has unknowingly been
selected for using traditional animal selection.
Homozygous and heterozygous animals are found within the
Holstein population.
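
As a rough illustration of what the 27% and 73% figures imply for
the mix of homozygous and heterozygous animals, the following Python
sketch computes expected genotype proportions under Hardy-Weinberg
equilibrium. It is a baseline only: it assumes the two percentages
are allele frequencies and that mating is random with no selection,
which may not hold here given that the A genotype appears to have
been selected for. The function name is hypothetical.

```python
def hardy_weinberg(freq_a: float) -> dict[str, float]:
    """Expected genotype proportions for a two-allele locus under
    Hardy-Weinberg equilibrium (assumes random mating, no selection)."""
    if not 0.0 <= freq_a <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    freq_b = 1.0 - freq_a
    return {
        "AA": freq_a ** 2,            # homozygous for the A variant
        "AB": 2.0 * freq_a * freq_b,  # heterozygous
        "BB": freq_b ** 2,            # homozygous for the B variant
    }


# Assumption: 27% is read as the frequency of the A allele.
expected = hardy_weinberg(0.27)
for genotype, proportion in expected.items():
    print(f"{genotype}: {proportion:.4f}")
```

Observed genotype counts that depart strongly from these proportions
would be one sign that the equilibrium assumptions do not apply.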

The genotype (α-lac (-13)) has been examined
for its correlation with milk and milk protein
production. The three additional variations are being
examined to determine the frequency of their differences
in the cattle population and their correlation with milk
and milk protein production. The possible linkage of
these genotypes is also being examined using DNA
sequencing. The goal of this technology is to identify
the optimal regulatory genotype for α-lactalbumin and to
select animals with those particular characteristics.

Detection and Selection of Four Genetic Variants 

The region of sequence where the α-lac (-13)
variation occurs can be amplified using the polymerase
chain reaction (PCR) (Sambrook et al., 1989) and two of
the following primers which were developed. Each primer
allows for amplification of a specific portion of the
α-lactalbumin gene. Combinations of the listed primers
can be used to amplify the region between any two of the
primer locations listed below.

Primer No.     Primer sequence                    Primer location
(SEQ ID NO:)                                      (from translation
                                                  start site)

1 (12)         5' CTCTTCCTGGATGTAAGGCTT 3'        (-120) - (-100)
2 (13)         5' TCCTGGGTGGTCATTGAAAGGACT 3'     (-2000) - (-1975)
3 (14)         5' CAATGTGGTATCTGGCTATTTAGTG 3'    (-717) - (-692)
4 (15)         5' AGCCTGGGTGGCATGGAATA 3'         (+53) - (+33)
5 (16)         5' GAAACGCGGTACAGACCCCT 3'         (+453) - (+433)
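
The size of the product expected from a given primer pair follows
from the listed locations. The Python sketch below is illustrative
and rests on two assumptions not stated in the text: that primers
1 to 3 read toward the gene and primers 4 and 5 read back toward
them (suggested by the descending locations of 4 and 5), and that
the numbering skips zero, running ..., -2, -1, +1, +2, .... The
names `PRIMER_SPANS` and `amplicon_length` are hypothetical.

```python
# Primer locations as listed in the table (5' end first), relative to the
# translation start site.
PRIMER_SPANS = {
    1: (-120, -100),
    2: (-2000, -1975),
    3: (-717, -692),
    4: (53, 33),
    5: (453, 433),
}


def amplicon_length(forward: int, reverse: int, has_zero: bool = False) -> int:
    """Positions spanned from the 5' end of the forward primer to the
    5' end of the reverse primer, inclusive."""
    f5, f3 = PRIMER_SPANS[forward]
    r5, r3 = PRIMER_SPANS[reverse]
    # Assumption: ascending locations mark a forward-reading primer,
    # descending locations a reverse-reading primer.
    if not (f5 < f3 and r5 > r3):
        raise ValueError("need one forward-reading and one reverse-reading primer")
    if f5 >= r5:
        raise ValueError("forward primer must lie upstream of the reverse primer")
    length = r5 - f5 + 1
    # Assumption: numbering has no position zero, so a span that crosses
    # the start site contains one position fewer.
    if not has_zero and f5 < 0 < r5:
        length -= 1
    return length


print(amplicon_length(3, 4))  # primers 3 and 4
print(amplicon_length(1, 4))  # primers 1 and 4
```

Under these assumptions primers 3 and 4 span 770 positions, which
agrees with the 770 base pair product mentioned in the text, although
the text does not say which primers produced that product.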

After amplification of the specific region,
the DNA is either sequenced or digested with restriction
enzymes to detect the sequence differences. In the case
of the α-lac (-13) variation, the sequence difference can
be seen using the restriction enzyme Mnl I (5' CTCC 3'
recognition site). The PCR DNA product is digested with
Mnl I and then run on a 4% NuSieve agarose gel to observe
the polymorphism.
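
The logic of the digest can be sketched in Python: an allele that
carries one more recognition site is cut into more, and therefore
smaller, fragments. The sketch is a simplification under stated
assumptions. The two 44-base sequences are hypothetical and differ
at a single base, the site is the one quoted in the text, both
strands are scanned, and the cut is placed at the start of each
site, whereas a real enzyme may cut at an offset from its site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def site_positions(seq: str, site: str) -> list[int]:
    """Sorted start indices of `site` on either strand of `seq`."""
    hits = set()
    for motif in {site, reverse_complement(site)}:
        index = seq.find(motif)
        while index != -1:
            hits.add(index)
            index = seq.find(motif, index + 1)
    return sorted(hits)


def fragment_lengths(seq: str, site: str) -> list[int]:
    """Fragment sizes if the enzyme cut at the start of every site
    (simplification: real cut offsets are ignored)."""
    cuts = [p for p in site_positions(seq, site) if p > 0]
    edges = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(edges, edges[1:])]


SITE = "CTCC"  # recognition site as quoted in the text

# Hypothetical PCR products: identical except for one base (A versus G).
PREFIX, MIDDLE, SUFFIX = "ATGCATTGCA", "ATTGACATTGCAATCA", "TTACATTGCA"
allele_a = PREFIX + "CTCC" + MIDDLE + "GGAG" + SUFFIX  # A creates a second site
allele_b = PREFIX + "CTCC" + MIDDLE + "GGGG" + SUFFIX  # no second site

print("A allele fragments:", fragment_lengths(allele_a, SITE))
print("B allele fragments:", fragment_lengths(allele_b, SITE))
```

In this toy case the adenine completes a site on the opposite strand,
so the A product is cut twice and the B product once, mirroring the
smaller band described for the A allele.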

A 650 base pair sequence containing all four
of the variations is being examined using a unique
sequencing technique. PCR is initially used to amplify a
770 base pair portion of the α-lactalbumin 5' flanking
region. Another PCR reaction is then performed using a
portion of the initial reaction and the following primers
(SEQ ID NO: 10 and SEQ ID NO: 11 respectively):

α-lac Seq. 1  5' ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT 3'
α-lac Seq. 2  5' AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT 3'

The primers listed above contain a portion of
the α-lactalbumin gene as well as both M13 DNA sequencing
primers. The primers are designed to allow for DNA
sequencing to be performed in both directions on the PCR
DNA product. The final PCR product will contain the
region of α-lactalbumin containing the four genetic
variants, the two M13 sequencing priming regions and 5
"dummy bases" on the end to aid in the M13 primer
binding.
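
The three-part layout of each primer (dummy bases, M13 priming
region, gene-specific portion) can be checked with a short Python
sketch. The two M13 sequences used below are the commonly listed
universal forward (-21) and reverse sequencing primers; the text
does not spell them out, so treat them as an assumption. The
function name is hypothetical.

```python
# Assumed universal M13 sequencing primer sequences (not given in the text).
M13_FORWARD = "TGTAAAACGACGGCCAGT"
M13_REVERSE = "CAGGAAACAGCTATGAC"

# Tailed primers as listed in the text (SEQ ID NO: 10 and 11).
PRIMERS = {
    "a-lac Seq. 1": ("ACGCTTGTAAAACGACGGCCAGTTGATTCTCAGTCTCTGTGGT", M13_FORWARD),
    "a-lac Seq. 2": ("AGCATCAGGAAACAGCTATGACCTGGGTGGCATGGAATAGGAT", M13_REVERSE),
}


def split_tailed_primer(primer: str, m13: str) -> tuple[str, str, str]:
    """Split a primer into (dummy bases, M13 region, gene-specific part)."""
    start = primer.find(m13)
    if start == -1:
        raise ValueError("M13 sequence not found in primer")
    end = start + len(m13)
    return primer[:start], primer[start:end], primer[end:]


for name, (primer, m13) in PRIMERS.items():
    dummy, m13_part, gene_part = split_tailed_primer(primer, m13)
    print(f"{name}: dummy={dummy} ({len(dummy)} bases), "
          f"M13={m13_part}, gene-specific={gene_part}")
```

With these assumed M13 sequences, each primer splits into 5 leading
bases, the M13 region, and a gene-specific tail, consistent with the
5 "dummy bases" described above.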

Comparison of Highly Conserved Portions of the 5'
Flanking Region of α-Lactalbumin Between Species

Reference is made to Figs. 8-10 for
DOTPLOT™ graphs comparing the bovine α-lactalbumin
sequence to the same region of the human (Fig. 8), guinea
pig (Fig. 9), and rat (Fig. 10). The region in Fig. 8
(human) spans 819 base pairs. The sequences are highly
conserved to about 700 base pairs. The region in Fig. 9
(guinea pig) spans 1381 base pairs. The sequences are
highly conserved to about 700 base pairs, but then
diverge. The region in Fig. 10 (rat) spans 1337 base
pairs. The sequences are highly conserved to about 700
base pairs, but then diverge. Species differences in
control regions would be expected to occur in
non-conserved regions of the sequence.
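
For readers unfamiliar with dot plots, the idea can be sketched in
a few lines of Python. This is a generic windowed comparison, not
the algorithm of the DOTPLOT program used for the figures; the
window size, threshold and test sequence are hypothetical. Each
returned pair marks a dot, and a run of dots along a diagonal
indicates a conserved stretch.

```python
def dot_plot(seq_x: str, seq_y: str, window: int = 5,
             min_matches: int = 4) -> list[tuple[int, int]]:
    """Return (i, j) pairs where the windows starting at seq_x[i] and
    seq_y[j] agree in at least `min_matches` of `window` positions."""
    dots = []
    for i in range(len(seq_x) - window + 1):
        for j in range(len(seq_y) - window + 1):
            matches = sum(seq_x[i + k] == seq_y[j + k] for k in range(window))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Hypothetical example: a sequence compared with itself gives an unbroken
# main diagonal when every window must match exactly.
example = "ACGTTGCAAG"
diagonal = dot_plot(example, example, window=5, min_matches=5)
print(diagonal)
```

When two sequences are conserved only over their first part, the
diagonal is continuous there and breaks up where they diverge, which
is the pattern described for the guinea pig and rat comparisons.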

Comparison of 5' Flanking Region of Bovine α-Lactalbumin
to Other Bovine Milk Protein Genes

Portions of the 5' flanking region of the
other bovine milk protein genes (αs1- and αs2-casein,
β-casein, κ-casein and β-lactoglobulin) which are highly
conserved with the α-lactalbumin 5' flanking region were
identified. It is probable that sequence differences
within these regions will also have an effect on mRNA
production as well as final protein production. Two
examples of these highly homologous regions are listed
below.

The bovine α-lactalbumin sequence from (-161)
- (-115) (SEQ ID NO: 17) is compared to the bovine β-casein
sequence (SEQ ID NO: 18) corresponding to the same region
of the gene. Percent similarity is 69% over 46 bases.

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA
|||| ||| | | |||| || | |  | ||||| | || ||  |||
AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA
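
The quoted percentage can be checked directly from the two aligned
sequences. The Python sketch below counts identical bases over the
positions where neither sequence has a gap (written `.`). The text
reports "percent similarity" without defining it, so treating it as
identity over ungapped positions is an assumption; the function name
is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = ".") -> float:
    """Percentage of identical bases over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for base_a, base_b in zip(aligned_a, aligned_b):
        if gap in (base_a, base_b):
            continue  # skip columns where either sequence has a gap
        compared += 1
        matches += base_a == base_b
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


# Aligned sequences as printed in the text (SEQ ID NO: 17 and 18).
ALPHA_LAC = "AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA"
BETA_CASEIN = "AGGAGGCT.ATTCTTTCCTTTTAGTCTATACTGTCTTCGCTCTTCA"

print(f"{percent_identity(ALPHA_LAC, BETA_CASEIN):.1f}%")
```

Counted this way, 31 of the 45 ungapped positions are identical,
which rounds to the 69% stated above.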

The bovine α-lactalbumin sequence (SEQ ID
NO: 19) from (-1420) - (-1351) is compared to the bovine
β-casein sequence (SEQ ID NO: 20) corresponding to the
same region of the gene. Percent similarity is 75% over
69 bases.

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCA
| | ||||||||  ||||   | |||               ||| |||||
TCTCAGAAATCACACTTTTTTGCCTGTG.............GCCTTGGCA

TACCAGAAGCTAACAGCTA
 |||| |||||||||  ||
.ACCAAAAGCTAACACATA



The included data indicate that the bovine
α-lactalbumin gene will be useful as a selection tool in
the dairy cattle industry as well as a valuable
control/enhancer and gene to be used in the field of
genetically engineered mammals. The control region we
have cloned contains the necessary regulatory elements to
express genes in the milk of genetically engineered
mammals as well as the "high expressing genotype" as
shown by our milk and milk protein production and
sequence variation data. These facts make this a useful
gene in both industrial and research areas. Application
of these techniques to the other milk proteins will allow
for the selection of valuable genotypes corresponding to
the β-casein, αs1- and αs2-casein and κ-casein genes and
the β-lactoglobulin genes.
Coding Region 

The coding region of the α-lactalbumin
protein includes a 1.7 kilobase sequence.

3' Flanking Region

The 3' flanking region is an 8.8 kilobase
flanking region downstream of the DNA sequence coding for
the desired recombinant protein. This region apparently
stabilizes the RNA transcript of the expression system
and thus increases the yield of desired protein from the
expression system.
Operation 

The above-described expression systems may be
prepared by methods well-known in the art. Examples
include various ligation techniques employing
conventional linkers, restriction sites, etc.
Preferably, these expression systems are part of larger
plasmids.

After isolation and purification, the 
expression systems or constructs are added to the gene 
pool which is to be genetically altered. 

The methods for genetically engineering
mammals are well-known to the art. Reference is made
to Alberts, B. et al., 1989 and Lewin, B., 1990, for
textbook descriptions of genetic engineering and
transgenic alteration of animals. Briefly, genetic
engineering involves the construction of expression
vectors so that a cDNA clone or genomic structure is
connected directly to a DNA sequence that acts as a
strong promoter for DNA transcription. By means of
genetic engineering, mammalian cells, such as mammary
tissue, can be induced to make vast quantities of useful
proteins.

For the purposes of this invention, the term
"genetic engineering," as defined supra in the list of
definitions, includes single line alteration, i.e.,
genetic alteration only during the life of the affected
animal with no germ line permanence. The construct can
be genetically incorporated in mammalian glands such as
mammary glands and mammalian stem cells.

Genetic engineering also includes transgenic
alteration, i.e., the permanent insertion of the gene
sequence into the genomic structure of the affected
animal and any offspring. Transgenically altering a
mammal involves microinjecting a DNA construct into the
pronuclei of the fertilized mammalian egg to cause one or
more copies of the construct to be retained in the cells
of the developing mammal. In a transgenic animal, the
engineered genes are permanently inserted into the germ
line of the animal.

The genetically engineered mammal is then
characterized by an expression system comprising the
a-lactalbumin control region operatively linked to an
exogenous DNA sequence coding for the recombinant protein
through a DNA sequence coding for a signal peptide
effective in secreting and maturing the recombinant
protein in mammary tissue. In order to produce and
secrete the recombinant protein into the mammal's milk,
the transgenic mammal must be allowed to produce the
milk, after which the milk is collected. The milk may
then be used in standard manufacturing processes. The
exogenous recombinant protein may also be isolated from
the milk according to methods known to the art.
Selection Characteristics

The a-lactalbumin control/enhancer sequence
of Fig. 1 is also important as a selection characteristic
for identifying superior or elite milk producing mammals.
Presently, those in the dairy cattle business can only
rely on pedigree information, which is frequently not
available, to predict milk and milk protein production in
mammals, specifically the bovine species. The study of
physiological markers as a means for determining milk and
milk protein production has received some interest. The
most common physiological marker traits studied in dairy
cattle are hormones, enzymes, and different blood
metabolites. Components of the immune system have also
been studied. Traits listed as possible marker traits
for milk yield include thyroxine, blood urea nitrogen,
growth hormones, insulin-like growth factors and insulin,
and glucose and free fatty acids. While these techniques
have shown some advances in predicting milk and milk
protein production in a dairy animal, there is currently
no other reliable means to predict these characteristics.

The present invention provides a selection
characteristic for identifying superior milk and milk
protein-producing mammals comprising inherited genetic
material which is DNA occurring in the genetic structure
of the mammal in which the genetic material encodes a
dominant selectable marker for bovine a-lactalbumin.

The DNA sequence disclosed herein serves as a
characteristic marker for elite milk producing mammals.

The examples below describe the invention
disclosed herein, although the invention is not to be
understood as limited in any way to the terms and scope
of the examples.

EXAMPLES

Example 1: a-lac (-13) variation study.

Forty-two mammals were selected in a
stratified random manner to provide mammals of a wide
range of milk and milk protein production capabilities
within the UW herd.

DNA was isolated according to procedures
known to the art from a random sample of 42 Holstein
dairy cows in the University of Wisconsin-Madison herd.
Each mammal was genotyped as described previously for the
a-lactalbumin (-13) variation using a 4% NuSieve gel of
MnlI digested PCR products.

The gene frequency in this population is 28%
for the a-lac (-13) A and 72% for the a-lac (-13) B.
Each of the distinct genotypes is shown on the gel in
Fig. 12. The legend for the gel of Figure 12 is as
follows:

Lane 1     Molecular Weight Standards
Lanes 2-3  heterozygous a-lac (-13) AB
Lane 4     homozygous a-lac (-13) BB
Lane 5     heterozygous a-lac (-13) AB
Lane 6     homozygous a-lac (-13) BB
Lane 7     homozygous a-lac (-13) AA
Lane 8     heterozygous a-lac (-13) AB
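
The relationship between the reported gene frequencies (28% A, 72% B) and the genotype classes scored on the gel can be illustrated with a short calculation. The following is a minimal sketch only: it assumes Hardy-Weinberg proportions, which the specification does not state, and the function names are hypothetical. Observed genotype counts are not given in the text, so none are used here.

```python
def allele_frequency_a(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Frequency of the A allele from genotype counts (each animal carries two alleles)."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    return (2 * n_aa + n_ab) / total_alleles


def expected_genotype_counts(freq_a: float, n_animals: int) -> dict:
    """Expected AA/AB/BB counts, ASSUMING Hardy-Weinberg proportions (not stated in the text)."""
    freq_b = 1.0 - freq_a
    return {
        "AA": n_animals * freq_a ** 2,
        "AB": n_animals * 2 * freq_a * freq_b,
        "BB": n_animals * freq_b ** 2,
    }


# Gene frequency and herd size as reported in Example 1.
expected = expected_genotype_counts(0.28, 42)
for genotype, count in expected.items():
    print(f"{genotype}: {count:.1f}")  # roughly 3.3 AA, 16.9 AB, 21.8 BB
```

Under this assumption only about three of the 42 animals would be expected to be homozygous AA.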

Analysis of the genetic capabilities of the
42 mammals indicates a possible major gene effect caused
by the a-lac (-13) allele or linked to the a-lac (-13)
allele. A scatter plot of each data point as well as
mean values for each of the three genotypes is
illustrated in Fig. 13. Holstein cows were compared
using their predicted transmitting ability for milk.
The data indicate that the a-lac (-13) A
genotype is the preferred genotype for milk and milk
protein production. Table 1 shown below indicates the
statistical association of differences in milk and milk
protein production ability observed between each of the
genotypes for the traits listed below. Analysis of
variance and t tests (LSD) were performed on the data.
All of the production yield traits were positively
correlated with the a-lac (-13) A allele. Milk protein
percentage was negatively correlated to the a-lac (-13) A
allele.

Table 1

Trait/Genotype        Genotype
                      a-lac (-13) AA   a-lac (-13) AB   a-lac (-13) BB

PTA (Milk)/AA                          N.S.             p<0.02
PTA (Milk)/AB         N.S.                              p<0.02
ME305 (Milk)/AA                        N.S.             N.S.
ME305 (Milk)/AB       N.S.                              p<0.1
PTA (Protein #)/AA                     N.S.             N.S.
PTA (Protein #)/AB    N.S.                              p<0.1
PTA (Protein %)/AA                     N.S.             p<0.01
PTA (Protein %)/AB    N.S.                              p<0.01

Example 2: Production of transgenic mice to study the
regulation of bovine a-lactalbumin gene expression.

Genomic Library Screening:

The gene encoding the milk protein bovine
a-lactalbumin was isolated from a bovine genomic library
(Woychik, 1982). The genomic library was screened
according to the following procedure. Approximately 1.5
million lambda plaques were transferred to nylon
[...] (1989). The a-lactalbumin cDNA (Hurley, 1987) or a 770
[...]
washed (twice in 2X SSC, 1% SDS; once in 0.1X SSC, 0.1%
SDS, at 65C) and placed on Kodak X-OMAT film for
[...] containing the
[...]
The 8.0 [...] is illustrated in Fig. [...].
This fragment contained 3.1 kilobases of 5'
[...]
kilobases of 3' flanking region.
Production of Transgenic Mice:

Mature C57B6 X DBA2J F1 (B6D2) females were
superovulated (PMSG and hCG) and mated with ICR or B6D2
males to yield fertilized eggs for pronuclear
microinjection. The eggs were microinjected using a
[...] Nikon inverted microscope.
[...] were transferred
to each pseudopregnant recipient. (University of
Wisconsin Biotechnology Center Transgenic Mouse
Facility, Dr. Jan Heideman).

Screening of Mice:

Tail DNA was extracted using the method
described by Constantini et al. (1986). Polymerase chain
reaction [...]
(Pharmacia Intl., Milwaukee, WI) [...]
(upstream primer 25mer -712 to -687 (5' [...] 3') [...],
downstream primer 20mer [...] to +59
(5' AGCCTGGGTGGCATGGAATA 3') (SEQ ID NO: 15)),
1 unit Taq DNA polymerase (Promega Corp. [...])
[...] volume adjusted
to 100 µl with double distilled sterile water and the
reaction was overlaid with heavy mineral oil. Samples
were subjected to 30 cycles (94C 2 min., 50C 1.5 min.,
72C 1.5 min.). Products were run in a 1% agarose gel
and stained with ethidium bromide.
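
The orientation of the downstream primer (SEQ ID NO: 15) relative to the control region listed as SEQ ID NO: 3 can be checked with a few lines of code. This is a minimal sketch: the helper name is hypothetical, and the template string is only the final full row of SEQ ID NO: 3 (bases 1981 to 2040) as printed in the sequence listing, not the whole sequence.

```python
# Translation table for complementing unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


DOWNSTREAM_PRIMER = "AGCCTGGGTGGCATGGAATA"  # SEQ ID NO: 15

# Bases 1981-2040 of SEQ ID NO: 3, copied from the sequence listing.
TEMPLATE_1981_2040 = (
    "GATGTCCTTTGTCTCTCTGCTCCTGGTAGGCATCCTATTCCATGCCACCCAGGCTGAACA"
)

site = reverse_complement(DOWNSTREAM_PRIMER)
offset = TEMPLATE_1981_2040.find(site)  # 0-based offset within the row, -1 if absent

if offset >= 0:
    first = 1981 + offset
    last = first + len(site) - 1
    print(f"primer anneals to the listed strand at bases {first}-{last}")
```

The reverse complement of the primer is found within this row, which is consistent with its use as the downstream (antisense) primer.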
Mouse Milking:

The mice were separated from their litters
for four hours and then anesthetized (0.01 ml/g body
weight I.P. injection of 36% propylene glycol, 10.5%
ethyl alcohol (95%), 41.5% sterile water, and 12% sodium
pentobarbital (50 mg/ml)). After being anesthetized the
mice were injected I.M. with 0.3 I.U. oxytocin and milked
using a small vacuum milking machine. Three of fifty-one
live offspring were identified as being transgenic using
polymerase chain reaction. Reference is made to Fig. 14
for a graph illustrating expression levels observed in
each of the 3 a-lactalbumin transgenic mouse lines.

ELISA:

Second generation mammals from one line were
milked and analysis was performed using an ELISA (enzyme
linked immunosorbent assay) for bovine a-lactalbumin
according to the following procedure:

1. Coat 1/40k bovine a-lactalbumin
antiserum, 100 µl per well (in 0.05M carbonate buffer, pH
9.6), on Nunc-Immuno Plate IF MaxiSorp.

2. Wash 4x with wash buffer (0.025% Tween
20 in PBS pH 7.2).

3. Add 50 µl assay buffer (0.04M MOPS,
0.12M NaCl, 0.01M EDTA, 0.1% gelatin, 0.05% Tween 20,
0.005% chlorhexidine digluconate, Leupeptin 50 mg/ml, pH
7.4).

4. Add 50 µl of standards and samples (in
assay buffer) in triplicate.

5. Add 50 µl 1/100k diluted a-lactalbumin
biotin conjugate.

6. Incubate overnight at 4C.

7. Wash 4x with wash buffer.

8. Add 100 µl 1/10k assay buffer diluted
ExtrAvidin-peroxidase (Sigma). Incubate 2 hours at RT.

9. Wash 4x twice with wash buffer.

10. Add 125 µl fresh substrate buffer (200
µl tetramethylbenzidine (20 mg/ml in DMSO), 64 µl 0.5M
hydrogen peroxide, 19.74 ml sodium acetate, pH 4.8).

11. Incubate for 12 minutes at RT.

12. Add 50 µl 0.5M sulfuric acid to stop
substrate reaction.

13. Read absorbance at 450 nm minus 600 nm
in an EIA autoreader.
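
The final read-out step (absorbance at 450 nm minus 600 nm, with standards and samples in triplicate) may be reduced to a concentration as sketched below. This is an illustration under stated assumptions, not the procedure actually used: the specification does not say how the standard curve was fitted, so piecewise-linear interpolation is assumed, and every numeric value and function name here is hypothetical.

```python
from statistics import mean


def corrected_absorbance(a450: float, a600: float) -> float:
    """Reference-corrected reading: absorbance at 450 nm minus 600 nm."""
    return a450 - a600


def interpolate_concentration(absorbance, standards):
    """Piecewise-linear interpolation on (concentration, absorbance) standards.

    ASSUMPTION: linear interpolation between neighbouring standards.
    Sorting by absorbance makes this work whether the signal rises or
    falls with concentration.
    """
    points = sorted(standards, key=lambda p: p[1])
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:
                return c0
            return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)
    raise ValueError("absorbance lies outside the standard curve")


# Hypothetical standard curve: (concentration in mg/ml, corrected absorbance).
standards = [(0.0, 1.20), (0.5, 0.80), (1.0, 0.40)]

# Hypothetical triplicate readings for one sample: (A450, A600).
triplicate = [(0.65, 0.05), (0.66, 0.06), (0.64, 0.04)]
sample_abs = mean(corrected_absorbance(a450, a600) for a450, a600 in triplicate)
sample_conc = interpolate_concentration(sample_abs, standards)
print(f"estimated concentration: {sample_conc:.2f} mg/ml")
```

A reading outside the range of the standards raises an error rather than being extrapolated; such a sample would have to be re-assayed at a different dilution.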

Bovine a-lactalbumin was present at
concentrations up to and beyond 1.0 mg/ml mouse
milk. Expression was determined by Western blotting in
the following steps. The 14% PAGE gel was transferred to an
Immobilon-P membrane (Millipore), which was blocked in
0.02M sodium phosphate, 0.12M NaCl, 0.01% gelatin, 0.05%
Tween 20, pH 7.2, and incubated with anti-bovine
a-lactalbumin (1/2000 dilution) for 2 hours at room
temperature. The membrane was washed twice (2 min.) with an
ELISA wash buffer and incubated with goat anti-rabbit
IgG-HRP for 2 hours at room temperature, followed by
washing 3 times with a wash buffer and washing once with
double-distilled water. The membrane was placed in a substrate
solution (25 mg 3,3'-diaminobenzidine, 1 ml 1% CoCl2
H2O, 49 ml PBS pH 7.4 and 0.05 ml 30% H2O2) and monitored
for color development. The membrane was air dried.

It is understood that the invention is not
confined to the particular constructions and arrangements
herein illustrated and described, but embraces such
modified forms thereof as come within the scope of the
claims following the bibliography.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: BLECK, GREGORY T.
               BREMEL, ROBERT D.

(ii) TITLE OF INVENTION: DNA SEQUENCE ENCODING BOVINE
     ALPHA-LACTALBUMIN AND METHODS OF USE

(iii) NUMBER OF SEQUENCES: 20

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: ANDRUS, SCEALES, STARKE & SAWALL

(B) STREET: 100 E. WISCONSIN AVE., SUITE 1100

(C) CITY: MILWAUKEE

(D) STATE: WI

(E) COUNTRY: USA

(F) ZIP: 53202-4178

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk

(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:

(B) FILING DATE:

(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Sara, Charles S

(B) REGISTRATION NUMBER: 30,492

(C) REFERENCE/DOCKET NUMBER: F. 3262-1

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (608) 255-2022

(B) TELEFAX: (608) 255-2182

(C) TELEX: 26832 ANDSTARK

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 26 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGACCATGA TTACGAATTC ATCGTA 26

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 88 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GAACAGTTAT CTAGATCTCG AGCTCGCGAA AGCTTGCATG CCTr... 

CATCCCCCGG .ACCGAGCTC GAAXXCAC " '^^"""'^^^ «° 

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(ix) FEATURE:
(ix) FEATURE:
(ix) FEATURE:
(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

[bases 1 to 840 of SEQ ID NO: 3 are not legible]

AGTCCATGGG ATTGCAAAGA GTTGAACACA ACTGAGCAAC TAAGCACAGC ACAGTACAGT 900

ATACACCTGT GAGGTGAAGT GAAGTGAAGG TTCAATGCAG GGTCTCCTGC ATTGCAGAAA 960 

GATTCTTTAC CATCTGAGCC ACCAGGGAAG CCCAAGAATA CTGGAGTGGG TAGCCTATTC 1020 

CTTCTCCAGG GGATCTTCCC ATCCCAGGAA TTGAACTGGA GTCTCCTGCA TTTCAGGTGG 1080 

ATTCTTCACC AGCTGAACTA CCAGGTGGAT ACTACTCCAA TATTAAAGTG CTTAAAGTCC 1140 

AGTTTTCCCA CCTTTCCCAA AAAGGTTGGG TCACTCTTTT TTAACCTTCT GTGGCCTACT 1200 

CTGAGGCTGT CTACAAGCTT ATATATTTAT GAACACATTT ATTGCAAGTT GTTAGTTTTA 1260 

GATTTACAAT GTGGTATCTG GCTATTTAGT GGTATTGGTG GTTGGGGATG GGGAGGCTGA 1320 

TAGCATCTCA GAGGGCAGCT AGATACTGTC ATACACACTT TTCAAGTTCT CCATTTTTGT 1380 

GAAATAGAAA GTCTCTGGAT CTAAGTTATA TGTGATTCTC AGTCTCTGTG GTCATATTCT 1440 

ATTCTACTCC TGACCACTCA ACAAGGAACC AAGATATCAA GGGACACTTG TTTTGTTTCA 1500 

TGCCTGGGTT GAGTGGGCCA TGACATATGA TGATGTACAG TCCTTTTCCA TATTCTGTAT 1560 

GTCTCTAAGA GGAAGGAGGA GTTGGCCGTG GACCCTTTGT GCATTTTCTG ATTGCTTCAC 1620 

TTGTATTACC CCTGAGGCCC CCTTTGTTCC TGAAATAGGT TGGGCACATC TTGCTTCCTA 1680 

GAACCAACAC TACCAGAAAC AACATAAATA AAGCCAAATG GGAAACAGGA TCATGTTTGT 1740 

AACACTCTTT GGGCAGGTAA CAATACCTAG TATGGACTAG AGATTCTGGG GAGGAAAGGA 1800 

AAAGTGGGGT GAAATTACTG AAGGAAGCTC AATGTTTCTT TGTTGGTTTT ACTGGCCTCT 1860 

CTTGTCATCC TCTTCCTGGA TGTAAGGCTT GATGCCAGGG CCCCTAAGGC TTTTTCCACA 1920 

AATAAAAGGA GGTGAGCAGT GTGGTGACCC CATTTCAGAA TCTTGAGGGG TAACCAAAAT 1980 

GATGTCCTTT GTCTCTCTGC TCCTGGTAGG CATCCTATTC CATGCCACCC AGGCTGAACA 2040 

GTTA 2044 

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 14 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
CATATTCTAT TCTA 14 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
CATATTCTAT TCCTA 

(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 
CATATTCTAT TTCTA 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:
TCTTGAGGGG TAACCAAA 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
TCTTGGGGGT AGCCAAA 

(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
TCTTCCGGGG TCACCAAA

(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 
ACGCTTGTAA AACGACGGCC AGTTGATTCT CAGTCTCTGT GGT 43 
(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 43 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
AGCATCAGGA AACAGCTATG ACCTGGGTGG CATGGAATAG GAT 43 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
CTCTTCCTCG ATGTAAGGCT T 
(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
TCCTGGGTGG TCATTGAAAG GACT 
(2) INFORMATION FOR SEQ ID NO: 14: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 
CAATGTGGTA TCTGGCTATT TAGTG 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 
AGCCTGGGTG GCATGGAATA 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 
GAAACGCGGT ACAGACCCCT 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
AGGAAGCTCA ATGTTTCTTT GTTGGTTTTA CTGGCCTCTC TTGTCA 46 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 
AGGAGGCTAT TCTTTCCTTT TAGTCTATAC TGTCTTCGCT CTTCA 
(2) INFORMATION FOR SEQ ID NO: 19: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19: 
TATAAGAAAT CAGGCTTTAG AGACTGATGT AGAGAGAATG AGCCCTGGCA TACCAGAAGC 
TAACAGCTA 

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 55 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
TCTCAGAAAT CACACTTTTT TGCCTGTGGC CTTGGCAACC AAAAGCTAAC ACATA 
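
Because each listing entry states a LENGTH and some sequence rows carry a trailing base count, the entries can be cross-checked mechanically. The following is a minimal sketch with hypothetical helper names; it is not part of the PatentIn software named above, and it covers only three of the listed sequences as examples.

```python
from typing import Optional, Tuple


def parse_listing_row(row: str) -> Tuple[str, Optional[int]]:
    """Split a listing row into (bases, trailing count or None)."""
    tokens = row.split()
    count = None
    if tokens and tokens[-1].isdigit():
        count = int(tokens.pop())  # trailing running base count, when printed
    return "".join(tokens).upper(), count


def length_is_consistent(row: str, stated_length: int) -> bool:
    """True when the bases, the stated LENGTH and any trailing count agree."""
    bases, count = parse_listing_row(row)
    if set(bases) - set("ACGT"):
        raise ValueError("row contains characters other than A, C, G, T")
    return len(bases) == stated_length and count in (None, stated_length)


# (row as printed in the listing, stated LENGTH in base pairs)
entries = {
    "SEQ ID NO: 1": ("ATGACCATGA TTACGAATTC ATCGTA 26", 26),
    "SEQ ID NO: 10": ("ACGCTTGTAA AACGACGGCC AGTTGATTCT CAGTCTCTGT GGT 43", 43),
    "SEQ ID NO: 20": (
        "TCTCAGAAAT CACACTTTTT TGCCTGTGGC CTTGGCAACC AAAAGCTAAC ACATA", 55),
}

results = {name: length_is_consistent(row, n) for name, (row, n) in entries.items()}
for name, ok in results.items():
    print(name, "consistent" if ok else "MISMATCH")
```

A row containing characters other than A, C, G and T raises an error, which flags damaged rows instead of silently miscounting them.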
CLAIMS
What is claimed is:
1. A mammary specific DNA sequence encoding
bovine a-lactalbumin and promoting quantitative
differences in gene expression among mammals, wherein the
DNA sequence is characterized by variations in the gene
structure in the control region of bovine a-lactalbumin.

2. The DNA sequence of claim 1 wherein one of
the variations is in the -13 position of the DNA
sequence.

3. The DNA sequence of claim 2 wherein the -13
position is occupied by adenine.

4. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 19) in the control
region of bovine a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

5. The DNA sequence of claim 1 comprising the
following DNA sequence (SEQ ID NO: 17) in the control
region of bovine a-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

6. The DNA sequence of claim 1 comprising the
DNA sequence listed in Fig. 5 (SEQ ID NO: 3) in the
control region of bovine a-lactalbumin.

7. An expression vector comprising the DNA
sequence of claim 1.
8. An expression system comprising a mammary
specific a-lactalbumin control region construct which
when genetically incorporated into a mammal permits the
female species of that mammal to produce the desired
recombinant protein in its milk.

9. The expression system of claim 8 which
comprises at least one a-lactalbumin control region
construct operatively linked to a DNA sequence coding for
a signal peptide and a-lactalbumin.

10. The expression system of claim 8 which
comprises a 3' flanking region downstream of the DNA
sequence coding for a-lactalbumin.

11. The expression system of claim 10 wherein the
construct includes a 5' flanking region upstream of the
DNA sequence coding for the signal peptide.

12. The expression system of claim 8 wherein the
construct comprises a 5' a-lactalbumin flanking region
attached to a bovine B-casein gene.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of B-casein
and approximately 100 base pairs of 5' a-lactalbumin
flanking region.

14. The expression system of claim 12 wherein the 
construct includes the proximal promoter from a first 
milk protein and the distal control region of a second 
milk protein. 

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the
group consisting of a-lactalbumin, B-casein, as1-casein,
as2-casein and K-casein.

16. The expression system of claim 12 wherein the 
construct includes the proximal promoter of B-casein and 
the distal control region of a-lactalbumin. 

17. A genetically engineered mammal characterized
by an expression system comprising an a-lactalbumin
control region operatively linked to an exogenous DNA
sequence coding for a desired protein to be expressed in
milk through a DNA sequence coding for a signal peptide
effective in secreting and maturing the protein in
mammary tissue.

18. The genetically engineered mammal of claim 17
wherein the a-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
a-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 19) in the control region of bovine
a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

19. The genetically engineered mammal of claim 17
wherein the a-lactalbumin control region includes a
mammary specific DNA sequence encoding bovine
a-lactalbumin and having the following nucleotide sequence
(SEQ ID NO: 20) in the control region of bovine
a-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

20. Products produced by the genetically 
engineered mammal of claim 17. 

21. Semen produced by the genetically engineered 
mammal of claim 17. 

22. Milk produced by the genetically engineered
mammal of claim 17.

23. A transgenic mammal of claim 17.

24. A DNA sequence coding for a-lactalbumin which
is operatively linked in an expression system of a
mammary specific a-lactalbumin protein control region, or
any control region which specifically activates
a-lactalbumin in milk or in mammary tissue, through a
signal peptide that permits secretion and maturation of
the a-lactalbumin in the mammary tissue.

25. The DNA sequence of claim 24 comprising the
following DNA sequence (SEQ ID NO: 19) in the control
region of bovine a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

26. The DNA sequence of claim 24 comprising the 
following DNA sequence (SEQ ID NO: 20) in the control 
region of bovine a-lactalbumin: 

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA. 

27. The DNA sequence of claim 24 comprising the
DNA sequence (SEQ ID NO: 3) listed in Fig. 5 in the
control region of bovine a-lactalbumin.

28. A process for genetically engineering the
incorporation of one or more copies of a construct
comprising an a-lactalbumin control region or any control
region sequence specifically activated in mammary tissue,
operatively linked to a DNA sequence coding for a desired
recombinant protein through a DNA sequence coding for a
signal peptide that permits the secretion and maturation
of a-lactalbumin in the mammary tissue.

29. The process of claim 28 wherein the construct 
is genetically incorporated into a mammal and the 
recombinant protein product is subsequently expressed and 
secreted into or along with the milk of the lactating 
genetically engineered mammal. 

30. The process of claim 29 wherein the construct
is genetically incorporated in mammalian embryos, mammalian
mammary glands, or mammalian stem cells.

31. The process of claim 28 wherein the mammals
are cows, sheep, goats, mice, oxen, camels, water
buffaloes, llamas and pigs.

32. A process for the production and secretion
into mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically
engineered mammal characterized by an
expression system comprising an a-lactalbumin
control region operatively linked to an
exogenous DNA sequence coding for the
recombinant protein through a DNA sequence
coding for a signal peptide effective in
secreting and maturing the recombinant
protein in mammary tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant
protein from the milk.

33. The process according to claim 32, wherein
said expression system also includes a 3' flanking region
coding for a-lactalbumin downstream of the DNA sequence.

34. The process according to claim 32, wherein
said expression system also includes a 5' flanking region
coding for the signal peptide upstream of the DNA
sequence.

35. A selection characteristic for identifying
superior milk producing mammals comprising a DNA sequence
encoding bovine a-lactalbumin and having the DNA sequence
(SEQ ID NO: 3) listed in Fig. 5 in the control region of
bovine a-lactalbumin.

36. A selection characteristic for identifying
superior milk producing mammals comprising inherited
genetic material which is a mammary specific DNA sequence
encoding bovine a-lactalbumin and promoting quantitative
differences in gene expression among mammals, wherein the
DNA sequence is characterized by variations in the gene
structure in the control region of bovine a-lactalbumin.

37. The selection characteristic of claim 36 
wherein one of the variations is in the -13 position of 
the DNA sequence. 

38. The selection characteristic of claim 37 
wherein the -13 position is occupied by adenosine. 

39. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 19) in
the control region of bovine a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGA
AGCTAACAGCTA.

40. The selection characteristic of claim 36
comprising the following DNA sequence (SEQ ID NO: 20) in
the control region of bovine a-lactalbumin:

AGGAAGCTCAATGTTTCTTTGTTGGTTTTACTGGCCTCTCTTGTCA.

41. The selection characteristic of claim 36 
comprising the DNA sequence (SEQ ID NO: 3) listed in Fig. 
5 in the control region of bovine a-lactalbumin. 

42. A method of predicting superior milk and milk
protein production in mammals comprising comparing
selected positions on the DNA sequence of the inherited
control region for a-lactalbumin in a subject mammal
with analogous positions on the DNA sequence of the
control region for a-lactalbumin from mammals known for
superior milk and milk protein production.

43. The method of claim 42 wherein one of the
selected positions is the -13 position on the control
region of the DNA sequence.

44. The method of claim 43 wherein the -13
position is occupied by the base adenine.

45. The method of claim 42 wherein the selected 
DNA sequence comprises a steroid response element. 
12. The expression system of claim 8 wherein the
construct comprises a 5' a-lactalbumin flanking region attached
to a bovine B-casein DNA sequence.

13. The expression system of claim 12 wherein the
construct contains the polyadenylation site of B-casein 5'
flanking region.

14. The expression system of claim 12 wherein the
construct includes the proximal promoter from a first milk
protein and the distal control region of a second milk protein.

15. The expression system of claim 14 wherein the
first and second milk proteins are selected from the group
consisting of a-lactalbumin, B-lactoglobulin, B-casein,
as1-casein, as2-casein and K-casein.

16. The expression system of claim 12 wherein the
construct includes the proximal promoter of B-casein and the
distal control region of a-lactalbumin.

17. A transgenic non-human mammal containing an
expression system comprising the DNA sequence listed in Fig. 5
(SEQ ID NO: 3) as the a-lactalbumin control region operatively
linked to an exogenous DNA sequence coding for a desired protein
to be expressed in milk through a DNA sequence coding for a
signal peptide effective in secreting and maturing the protein
in mammary tissue.

18. The transgenic non-human mammal of claim 17
wherein the a-lactalbumin control region includes a mammary
specific DNA sequence encoding bovine a-lactalbumin and having
the following nucleotide sequence (SEQ ID NO: 19) in the control
region of bovine a-lactalbumin:

TATAAGAAATCAGGCTTTAGAGACTGATGTAGAGAGAATGAGCCCTGGCATACCAGAAGCTA
ACAGCTA.
32. A process for the production and secretion into
non-human mammal's milk of an exogenous recombinant protein
comprising the steps of:

a. producing milk in a genetically engineered mammal
containing an expression system comprising a
mammary specific a-lactalbumin control region
operatively linked to a DNA sequence coding for
a signal peptide, which is effective in secreting
and maturing the recombinant protein in mammary
tissue;

b. collecting the milk; and

c. isolating the exogenous recombinant protein from
the milk.

34. The process according to claim 32, wherein said
expression system also includes a 5' flanking region coding for
the signal peptide upstream of the DNA sequence.